On the functional properties of microRNA-mediated feed forward loops
Abstract
Motivation: Recent studies of genome-scale regulatory networks suggested that feed-forward loop (FFL) circuitry is a key component of many such networks. This led to studies of the functional properties of different FFL types in which the regulatory elements are transcription factors.
Results: Here we investigate these properties when the mediating regulatory element of the loop is a microRNA. We find that many of the FFL properties are enhanced in this setup. We then identify all such FFLs in the D. melanogaster regulatory network. We observe that in FFLs rooted at the same transcription factor there are significant correlations between the number of predicted binding sites for the transcription factor and for the microRNA.
Conclusions: Based on a modeling approach, we suggest that these correlations may be an outcome of the type of FFL preferred by the transcription factor. This may help elucidate the type of regulation the TF confers.
Background
MicroRNAs (miRNAs) are short, highly conserved, endogenous RNA molecules (20-25 nt long) that regulate protein expression by selectively binding to mRNA transcripts. Through recruitment of the RISC protein complex, they inhibit translation or initiate RNA cleavage. In recent years the importance of miRNAs has become apparent: it is estimated that they constitute at least 1% of the genes in metazoan genomes, and that many of them target thousands of transcripts (reviewed, e.g., in Bartel, 2004; Ke, et al., 2003). However, despite their abundance and plausible importance, little is known about their cellular roles. Initially, miRNAs were considered to have widespread essential roles, especially in cellular fate specification (see, e.g., Bartel, 2004; Ke, et al., 2003). This view was supported by their tight conservation and the abundance of their targets.
However, this perception was somewhat altered by several studies (Giraldez, et al., 2005; Hornstein and Shomron, 2006; Stark, et al., 2005; Ying and Lin, 2005). For example, Giraldez et al. (Giraldez, et al., 2005) showed that zebrafish embryos whose miRNA processing mechanism has been knocked out undergo surprisingly accurate axis formation and cell differentiation, leading them to suggest a modulating or tissue-specific role for miRNAs. Stark et al. (Stark, et al., 2005) showed that the expression of many miRNAs tends to be mutually exclusive with that of their target transcripts, suggesting a main role for miRNAs in preventing translation of transcripts that arise from leaky basal transcription. Hornstein and Shomron (Hornstein and Shomron, 2006) proposed that this modulating role is achieved by the wiring of miRNAs within the cellular regulatory network. Specifically, they suggested that the expression of a miRNA is regulated by a transcription factor (TF) which, jointly with the miRNA, also regulates the expression of a protein-coding gene, forming a feed forward loop. For example, if this TF activates transcription of both the miRNA and the target gene, while the miRNA blocks translation of the target gene, the resulting feed forward loop (the so-called type-1 incoherent feed forward loop (Alon, 2007; Alonso and McKane, 2002); Figure 1) may confer robustness of the target-gene expression to variability in the TF levels (Alon, 2007). Feed forward loops (FFLs) have been shown to be a major component of biological networks, and their functionality within the context of expression regulation has been analyzed both theoretically (Ghosh, et al., 2005; Mangan and Alon, 2003; Mangan, et al., 2003) and experimentally (Mangan, et al., 2006). These studies focused on FFLs in which both regulatory components are TFs, such that expression regulation is controlled at the transcriptional level.
In this work we compare the functionality of such FFLs with the architecture suggested by Hornstein and Shomron, where the mediating regulator in the FFL is a miRNA (Figure 1). We find that such FFLs generally facilitate previously suggested functions more readily. However, this facilitation comes at the cost of a higher noise level in the protein product. To examine the properties of such miRNA-FFLs in a real network, we construct a large-scale, sequence-based enumeration of them in Drosophila melanogaster. We investigate the strength of the regulatory connections in FFLs, a parameter that can be deduced from sequence analysis or from DNA binding data but has so far received little attention in this context. We detect a puzzling correlation between the number of TF binding sites in a miRNA promoter and the number of binding sites for the miRNA in the 3' UTR of its predicted target. We suggest that this correlation may be the outcome of different types of FFLs. This, in turn, may shed light on the type of regulatory connection (positive or negative), as these can be deduced from the type of FFL in which they are involved.
Results
Comparison of miRNA-mediated and TF-mediated FFLs
We start by comparing the functional properties of miRNA-mediated and TF-mediated FFLs. The general setting for this analysis is shown in Figure 1. The FFL is rooted at a TF (X) which responds to an outside signal. When active, it activates or inhibits the expression of both a downstream protein-coding gene (Z) and an additional regulatory element (Y; Figure 1), which may be a TF or a miRNA. Importantly, the protein-coding target gene is also regulated by this additional regulatory element. The main interest in studying FFLs is the behavior and dynamics of the level of the protein-coding target gene Z, which may be thought of as the "output" of the FFL.
We used a modeling approach to examine three properties of FFLs: (i) "pulse generation" (Mangan and Alon, 2003): a rapid rise in the level of Z, followed by a gradual decrease to the steady-state level; (ii) "response time" (Mangan and Alon, 2003): the time it takes for Z to reach half its steady-state level (or initial level, if the latter is zero), where FFLs are known to either quicken or delay the response, depending on the specific architecture (Mangan and Alon, 2003); and (iii) "noise": the coefficient of variation of the steady-state level of Z. Our modeling approach and choice of values for the reaction constants roughly follow those of Mangan and Alon (Mangan and Alon, 2003), and are depicted for the type-1 incoherent FFL in Figure 2 (see also Methods). Figure 3 shows the dynamics of the level of Z resulting from these reaction schemes. Modeling of the type-3 and type-4 coherent FFLs is done similarly, and is described in the Supplementary Data.
When comparing the properties of the two types of FFLs, we observed several possible advantages of miRNA-FFLs over TF-FFLs (Table 1; see Methods). First, miRNA-FFLs are RNA-based rather than protein-based, and hence faster and more cost-effective (they require less energy). Second, the pulse generation of type-1 incoherent FFLs is stronger in miRNA-FFLs. Third, the response times of both type-1 incoherent FFLs and type-3 coherent FFLs are shorter in miRNA-FFLs. Finally, the delayed response (Mangan and Alon, 2003) of type-4 coherent FFLs is more effective in miRNA-FFLs. On the other hand, as shown in Table 1, miRNA-FFLs have a much noisier output. Using the Gillespie algorithm (Gillespie, 2006), as implemented in Dizzy (Ramsey, et al., 2005), we simulated the level of the protein Z at steady state. We found that this level is noisier in the miRNA-FFL than in the TF-mediated one, especially in the type-4 incoherent FFL and the type-3 coherent FFL.
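To make the definitions of pulse generation and response time concrete, the following is a minimal deterministic sketch of a miRNA-mediated type-1 incoherent FFL. It is not the reaction scheme of Figure 2 or the Methods: the rate laws, the unit rate constants, the `k_mirna` parameter and all function names are assumptions chosen only for illustration, and the miRNA is assumed to degrade the Z transcript without being consumed.

```python
# Illustrative sketch only: assumed rate laws and arbitrary rate constants,
# not the authors' model. X is a step input, Y is the miRNA, Z the output.

def simulate_ffl(k_mirna=10.0, t_end=20.0, dt=0.001):
    """Forward-Euler integration after X switches on at t = 0.

    State: y = miRNA level, m = mRNA of Z, z = protein Z.
    Returns (times, z_trace).
    """
    x = 1.0              # active TF, held constant (step input)
    b_y = a_y = 1.0      # miRNA production / decay rates (assumed)
    b_m = a_m = 1.0      # Z mRNA production / basal decay rates (assumed)
    b_z = a_z = 1.0      # Z translation / protein decay rates (assumed)
    y = m = z = 0.0
    times, z_trace = [0.0], [0.0]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        dy = b_y * x - a_y * y
        # The miRNA removes Z mRNA but is not itself used up (assumption).
        dm = b_m * x - a_m * m - k_mirna * y * m
        dz = b_z * m - a_z * z
        y, m, z = y + dt * dy, m + dt * dm, z + dt * dz
        times.append(i * dt)
        z_trace.append(z)
    return times, z_trace


def pulse_ratio(z_trace):
    """Peak level divided by the final level; a value above 1 indicates a pulse."""
    return max(z_trace) / z_trace[-1]


def response_time(times, z_trace):
    """First time at which Z reaches half of its final level."""
    half = 0.5 * z_trace[-1]
    return next(t for t, z in zip(times, z_trace) if z >= half)


times, z_trace = simulate_ffl()
print(f"final level   = {z_trace[-1]:.4f}")
print(f"pulse ratio   = {pulse_ratio(z_trace):.2f}")
print(f"response time = {response_time(times, z_trace):.2f}")
```

Calling `simulate_ffl(k_mirna=0.0)` removes the miRNA arm and gives simple regulation of Z by X, which is a convenient baseline for comparing the pulse ratio and response time. The noise property is not covered here, since it requires a stochastic simulation.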
Taken together, these results may support the conjecture of Hornstein and Shomron (Hornstein and Shomron, 2006) that a likely architecture for miRNA regulation within the global network is a FFL. However, they also suggest that TF-based and miRNA-based FFLs may have different properties, each being optimal for different scenarios. While a TF-based FFL is more appropriate for noise reduction, the use of a miRNA component enhances other functionalities of the FFL, including pulse generation, response time and delayed response. Under plausible simplifying assumptions, the differential equations associated with the reaction schemes can be solved analytically, as described in the Supplementary Data.
Identifying miRNA-FFLs in the D. melanogaster regulatory network
The basic architecture of a FFL is as in Figure 1. However, this schematic depiction does not take into account the strength or cooperativity of the regulatory connections. To study this aspect, we compiled a genome-scale list of miRNA-FFLs in D. melanogaster. We specifically focused on the number of TF and miRNA binding sites, which represents the strength of the regulatory connection. To this end, we first reconstructed the regulatory connections among TFs, miRNAs and protein-coding genes (see Methods). Briefly, this construction is based on all 14226 known protein-coding genes (downloaded from FlyBase (Crosby, et al., 2007)), the 78 known miRNAs (downloaded from miRBase (Griffiths-Jones, et al., 2006)), and 371 TF binding site motifs identified by Elemento and Tavazoie (Elemento and Tavazoie, 2005), which we further clustered down to 321. Regulation by miRNAs was taken from miRBase; regulation by TFs was determined by searching for cis-regulatory modules (CRMs) encompassing clusters of TF binding sites within proximal promoters of protein-coding and miRNA genes (see Methods). A similar analysis was also performed for H. sapiens, based on the TF binding site motifs identified by Xie et al. (Xie, et al., 2005).
The construction and the resulting network will be discussed elsewhere (submitted); here we focus on the strength of the regulatory connections within the identified miRNA-FFLs. Having identified all these regulatory connections, we obtained for each TF the list of miRNA-FFLs it is involved in. This is simply the set of all miRNA-gene pairs such that the miRNA targets the gene and the TF has a binding site (within a CRM) in the promoter region of both the miRNA and the gene. All in all, we identified 3945 such FFLs. In addition, for each edge in the FFL we noted its "strength": the number of TF (or miRNA) binding sites within the promoter (or 3' UTR).
Correlated number of binding sites in FFLs
We next examined the correlation between the number of TF binding sites in the miRNA promoter and the number of binding sites for that miRNA in the 3' UTR of the downstream target genes. Formally, for a TF T involved in k FFLs, we constructed two vectors of dimension k, $v_T^{TF \to miR}$ and $v_T^{miR \to gene}$. The ith entry of $v_T^{TF \to miR}$ is the number of binding sites for T in the promoter region of the miRNA of the ith FFL, and the ith entry of $v_T^{miR \to gene}$ is the number of binding sites for that miRNA in the 3' UTR of the downstream target gene (Figure 4a). We then computed the Spearman rank correlation between $v_T^{TF \to miR}$ and $v_T^{miR \to gene}$, and the corresponding p-value. Of the 321 TFs, 189 participate in at most one FFL, and for an additional 98 one of the vectors $v_T^{TF \to miR}$ or $v_T^{miR \to gene}$ is constant. In all these cases the Spearman rank correlation between the vectors is undefined. However, as shown in Figure 4b, the remaining 34 TFs display a puzzling bimodal distribution of correlation coefficients (Lilliefors normality test: α=0.01, p-value < 0.01), 16 of which are associated with a highly significant p-value (p-value < $10^{-3}$; false discovery rate = 0.0021).
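The enumeration and the per-TF correlation described above can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the three dictionaries (`tf_mirna`, `tf_gene`, `mirna_gene`), the function names and the toy counts are all hypothetical, and the Spearman coefficient is computed by hand on tie-averaged ranks so that the undefined cases (fewer than two FFLs, or a constant vector) are explicit.

```python
from collections import defaultdict


def enumerate_ffls(tf_mirna, tf_gene, mirna_gene):
    """Collect, per TF, one (TF->miR sites, miR->gene sites) pair for each FFL.

    A triple (tf, mirna, gene) counts as an FFL when the TF has sites in the
    promoters of both the miRNA and the gene, and the miRNA targets the gene.
    """
    ffls = defaultdict(list)
    for (tf, mirna), n_tf_sites in tf_mirna.items():
        for (m, gene), n_mir_sites in mirna_gene.items():
            if m == mirna and (tf, gene) in tf_gene:
                ffls[tf].append((n_tf_sites, n_mir_sites))
    return ffls


def average_ranks(values):
    """1-based ranks in which tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, or None when it is undefined."""
    if len(x) < 2 or len(set(x)) == 1 or len(set(y)) == 1:
        return None
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy binding-site counts (invented for illustration only).
tf_mirna = {("T1", "miR-a"): 1, ("T1", "miR-b"): 2, ("T1", "miR-c"): 1,
            ("T2", "miR-a"): 1}
tf_gene = {("T1", "g1"): 1, ("T1", "g2"): 3, ("T1", "g3"): 1, ("T2", "g1"): 2}
mirna_gene = {("miR-a", "g1"): 3, ("miR-b", "g2"): 1, ("miR-c", "g3"): 2,
              ("miR-a", "g4"): 1}

ffls = enumerate_ffls(tf_mirna, tf_gene, mirna_gene)
rho_by_tf = {}
for tf, pairs in ffls.items():
    v_tf_mir = [p[0] for p in pairs]
    v_mir_gene = [p[1] for p in pairs]
    rho_by_tf[tf] = spearman(v_tf_mir, v_mir_gene)
print(rho_by_tf)
```

In this toy input the second TF takes part in a single FFL, so its coefficient is reported as `None` rather than as a number, mirroring the undefined cases excluded in the text.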
Moreover, this bimodal distribution and high number of significant correlations are also evident for FFLs derived from the H. sapiens network (13 of 21 have p-value < $10^{-3}$). Interestingly, no significant correlation was found between the number of TF binding sites in the promoter region of the downstream gene and either $v_T^{TF \to miR}$ or $v_T^{miR \to gene}$. What might be the reason for this bimodality and the seemingly significant correlations? A possible explanation is provided by considering the effect of multiple binding sites in the modeling procedure described above. Specifically, we compared the functional properties of the miRNA-FFLs when substituting the reaction $X + P_y \leftrightarrow [XP_y]$ by $2X + P_y \leftrightarrow [2XP_y]$, and the reaction $Y + Z \to [YZ] \to Y$ by $2Y + Z \to [2YZ] \to 2Y$. Focusing on the example of the type-1 incoherent feed forward loop, our modeling suggests that two binding sites in the 3' UTR increase the strength of the pulse generated by the miRNA-FFL and shorten its response time. In contrast, two TF binding sites in the miRNA's promoter have little effect on functionality (Table 2), and thus one of them is likely to accumulate mutations and degenerate. This may imply that type-1 incoherent FFLs have a preference for a specific pattern of binding sites: one TF binding site in the miRNA's promoter and multiple miRNA binding sites in the gene's 3' UTR (i.e. $v_T^{TF \to miR} = 1$ and $v_T^{miR \to gene} > 1$). Such a preference will lead to a seemingly significant anti-correlation between $v_T^{TF \to miR}$ and $v_T^{miR \to gene}$ over a set of type-1 incoherent FFLs. Intuitively, this is clear: in many FFLs the number of binding sites in the miRNA promoter will be one, while their number in the gene's 3' UTR will be high. More formally, suppose that among a set of 50 FFLs (which is indeed the average number of FFLs per TF in our data) 70% display the preferred pattern, while each of the other possible patterns is displayed in 10% of the FFLs.
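The hypothetical set just described can be written out as a 2×2 table. The coding below is an assumption, not taken from the source: a single site is coded as 1, multiple sites as 2, and the three non-preferred patterns are taken to be (1,1), (2,1) and (2,2). Under that coding each vector takes only two values, so tie-averaged ranks are an affine relabeling of the values and the Spearman coefficient reduces to the phi coefficient of the table, where $n_{ij}$ is the number of FFLs with TF-to-miRNA count $i$ and miRNA-to-gene count $j$.

```latex
% Assumed counts among 50 FFLs:
%   n_{12} = 35 (preferred pattern), n_{11} = n_{21} = n_{22} = 5.
\[
\rho = \phi
  = \frac{n_{11}\,n_{22} - n_{12}\,n_{21}}
         {\sqrt{(n_{11}+n_{12})\,(n_{21}+n_{22})\,(n_{11}+n_{21})\,(n_{12}+n_{22})}}
  = \frac{5 \cdot 5 - 35 \cdot 5}{\sqrt{40 \cdot 10 \cdot 10 \cdot 40}}
  = \frac{-150}{400}
  = -0.375
\]
```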
In this case, the Spearman rank correlation is -0.375 (p=0.007). Furthermore, by generating such sets at random, we estimated that with probability 0.35 the p-value is less than $10^{-3}$. The same line of reasoning suggests that in sets of type-3 coherent FFLs a similar negative correlation will tend to appear, while in sets of type-4 coherent FFLs a positive correlation will tend to appear (see Supplementary Data). Taken together, this may account for both the seemingly significant correlations among binding-site numbers and their bimodal distribution. As an example, we looked for TF motifs from (Elemento and Tavazoie, 2005) which match (according to TransFac (Matys, et al., 2006)) the recognition site of the TF Ftz, a known activator, or of Snail, known to act both as an activator and as an inhibitor. Indeed, we find that Ftz is associated with 3 motifs with a defined correlation between $v_T^{TF \to miR}$ and $v_T^{miR \to gene}$, all three of which are negative, as is expected in a type-1 incoherent FFL, where the TF acts solely as an activator. Similarly, we find that the 9 motifs associated with Snail display both negative (5 motifs) and positive (4 motifs) correlations, compatible with the behavior suggested for the incoherent FFLs, where the TF acts as an activator of one element and an inhibitor of the other. An implicit assumption here is that FFLs rooted at the same TF will tend to be of the same type. This seems plausible according to data from E. coli compiled by Shen-Orr et al. (Shen-Orr, et al., 2002), where all five TFs that are involved in multiple FFLs display a preference for one type of FFL and regulatory connection (negative or positive), with which the majority of the FFLs comply (Table 3).
Discussion and Conclusions
In this work we analyzed the functional properties of FFLs in which the mediating regulatory element is a miRNA.
We have shown that although the reaction scheme for such FFLs is very similar to that of TF-mediated ones, the dynamics of the downstream protein product are very different. Our analysis also supports a previously suggested conjecture that miRNAs tend to be involved in FFLs (Hornstein and Shomron, 2006). However, the arising explanation is different: miRNA-FFLs enhance the functionality of the FFL rather than reduce noise at the protein level. Studying the strength of the regulatory connections, we observed that in FFLs rooted at the same TF there tends to be a seemingly significant positive or negative correlation between the number of TF binding sites in the miRNA promoter region and the number of binding sites for the miRNA in the 3' UTR of the joint target gene. A possible explanation for this is suggested by modeling the effect of multiple binding sites on FFL functionality: different coherency and architecture of FFLs may dictate a preferred arrangement of binding-site numbers, leading to these seemingly significant correlations as well as to the observed bimodal distribution of correlation coefficients. The contribution of the latter observation is two-fold. First, we explore the impact of multiple binding sites on FFL functionality, highlighting certain binding-site arrangements as more likely for a given type of FFL. Second, this analysis suggests that even in sequence-based constructions of regulatory networks it is possible, in specific cases, to predict negative or positive regulation, namely when a regulatory connection is part of a FFL whose specific type is suggested by analyzing the correlation between binding-site numbers. For example, TFs involved mainly in type-1 incoherent FFLs would be predicted to confer mainly positive regulation. Notably, a broader integration of other network properties is required for a refined characterization of regulatory interactions.
Differentiating between positive and negative correlations is not sufficient, as this distinction only partitions the FFLs into two sets. Identifying the exact type of FFL is required in order to elucidate the type of all FFL-integrated regulatory connections. Future work will hopefully incorporate properties such as the value of the correlation coefficient, or the actual number of binding sites, to achieve this goal.